

Mutual Information Maximization



Graph Contrastive Learning with Augmentations (Appendix) Yuning You

Neural Information Processing Systems

Superpixel graphs (statistics in Table S1) gain from all augmentations except attribute masking, as shown in Figure S1. Section D studies the difficulty of contrastive tasks versus pairing: "Identical" stands for a no-augmentation baseline for contrastive learning, and the baseline training-from-scratch accuracy is 79.71%. For the subgraph augmentation, we propose variants with different difficulty levels and compare contrastive learning performance across them.
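The subgraph augmentation mentioned above is commonly implemented by sampling connected subgraphs with a random walk. A minimal sketch of that idea on an adjacency-list graph (an illustration, not the paper's exact implementation):

```python
import random

def random_walk_subgraph(adj, start, num_nodes, seed=0):
    """Sample a connected node set by random walk -- one common way to
    realize the 'subgraph' augmentation in graph contrastive learning."""
    rng = random.Random(seed)
    visited = {start}
    current = start
    while len(visited) < num_nodes:
        neighbors = adj.get(current, [])
        if not neighbors:          # dead end: stop early
            break
        current = rng.choice(neighbors)
        visited.add(current)
    return visited

# Toy graph as adjacency lists.
adj = {0: [1, 2], 1: [0, 2], 2: [0, 1, 3], 3: [2]}
sub = random_walk_subgraph(adj, start=0, num_nodes=3)
```

Difficulty variants can then be derived by changing how much of the graph is kept or how the walk is biased.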


First Contact: Unsupervised Human-Machine Co-Adaptation via Mutual Information Maximization

Neural Information Processing Systems

How can we train an assistive human-machine interface (e.g., an electromyography-based limb prosthesis) to translate a user's raw command signals into the actions of a robot or computer when there is no prior mapping, we cannot ask the user for supervision in the form of action labels or reward feedback, and we do not have prior knowledge of the tasks the user is trying to accomplish? The key idea in this paper is that, regardless of the task, when an interface is more intuitive, the user's commands are less noisy. We formalize this idea as a completely unsupervised objective for optimizing interfaces: the mutual information between the user's command signals and the induced state transitions in the environment. To evaluate whether this mutual information score can distinguish between effective and ineffective interfaces, we conduct a large-scale observational study on 540K examples of users operating various keyboard and eye gaze interfaces for typing, controlling simulated robots, and playing video games. The results show that our mutual information scores are predictive of the ground-truth task completion metrics in a variety of domains, with an average Spearman's rank correlation of 0.43. In addition to offline evaluation of existing interfaces, we use our unsupervised objective to learn an interface from scratch: we randomly initialize the interface, have the user attempt to perform their desired tasks using the interface, measure the mutual information score, and update the interface to maximize mutual information through reinforcement learning. We evaluate our method through a small-scale user study with 12 participants who perform a 2D cursor control task using a perturbed mouse, and an experiment with one expert user playing the Lunar Lander game using hand gestures captured by a webcam. 
The results show that we can learn an interface from scratch, without any user supervision or prior knowledge of tasks, with less than 30 minutes of human-in-the-loop training.
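For discrete commands and transitions, the mutual-information score described above can be illustrated with a simple plug-in estimate over a joint count table. This is a minimal sketch of the quantity being maximized, not the paper's estimator (which must handle raw, high-dimensional signals):

```python
import numpy as np

def mutual_information(joint):
    """Plug-in mutual information (in nats) from a joint count table
    over (user command, induced state transition) pairs."""
    p = joint / joint.sum()
    px = p.sum(axis=1, keepdims=True)   # marginal over commands
    py = p.sum(axis=0, keepdims=True)   # marginal over transitions
    mask = p > 0
    return float((p[mask] * np.log(p[mask] / (px @ py)[mask])).sum())

# An intuitive interface: commands map deterministically to transitions.
good = np.array([[10, 0], [0, 10]])
# A noisy interface: commands carry no information about transitions.
bad = np.array([[5, 5], [5, 5]])
```

Under this score the deterministic interface attains the maximum (log 2 nats for two commands) while the noisy one scores zero, mirroring the intuition that a more intuitive interface induces less noisy commands.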


MIM4DD: Mutual Information Maximization for Dataset Distillation

Neural Information Processing Systems

Dataset distillation (DD) aims to synthesize a small dataset whose test performance is comparable to that of a full dataset using the same model. State-of-the-art (SoTA) methods optimize synthetic datasets primarily by matching heuristic indicators extracted from two networks, one trained on real data and one on synthetic data (see Fig. 1, Left), such as gradients and training trajectories. Yet DD is essentially a compression problem that emphasizes maximizing the preservation of the information contained in the data. We argue that well-defined information-theoretic metrics, which measure the amount of shared information between variables, are necessary for measuring this success, but have not been considered by previous works. We therefore introduce mutual information (MI) as the metric to quantify the shared information between the synthetic and the real datasets, and devise MIM4DD, which numerically maximizes the MI via a newly designed optimizable objective within a contrastive learning framework to update the synthetic dataset. Specifically, we designate samples from the two datasets that share the same label as positive pairs, and samples with different labels as negative pairs. We then pull together the samples in positive pairs and push apart those in negative pairs in the contrastive space by minimizing an NCE loss. As a result, the targeted MI can be transformed into a lower bound expressed in terms of the feature maps of the samples, which is numerically tractable. Experimental results show that MIM4DD can be implemented as an add-on module to existing SoTA DD methods.


Morphology-Aware KOA Classification: Integrating Graph Priors with Vision Models

Tliba, Marouane, Kerkouri, Mohamed Amine, Nasser, Yassine, Aburaed, Nour, Chetouani, Aladine, Bagci, Ulas, Jennane, Rachid

arXiv.org Artificial Intelligence

Knee osteoarthritis (KOA) diagnosis from radiographs remains challenging due to the subtle morphological details that standard deep learning models struggle to capture effectively. We propose a novel multimodal framework that combines anatomical structure with radiographic features by integrating a morphological graph representation, derived from Segment Anything Model (SAM) segmentations, with a vision encoder. Our approach enforces alignment between geometry-informed graph embeddings and radiographic features through mutual information maximization, significantly improving KOA classification accuracy. By constructing graphs from anatomical features, we introduce explicit morphological priors that mirror clinical assessment criteria, enriching the feature space and enhancing the model's inductive bias. Experiments on the Osteoarthritis Initiative dataset demonstrate that our approach surpasses single-modality baselines by up to 10% in accuracy (reaching nearly 80%), while outperforming existing state-of-the-art methods by 8% in accuracy and 11% in F1 score. These results underscore the critical importance of incorporating anatomical structure into radiographic analysis for accurate KOA severity grading.


MIM4DD: Mutual Information Maximization for Dataset Distillation (Appendix) Yuzhang Shang

Neural Information Processing Systems

More details can be found in [10]. This allows us to transform the target problem at the data level. The ConvNet comprises three consecutive blocks of 'Conv-InstNorm-ReLU-AvgPool'; the network's initial learning rate is 0.01, and training is stopped after 5,000 iterations.
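The training setup described above can be sketched in PyTorch. The channel width (128), the 10-way linear head, and the 32x32 input size are our assumptions for illustration; only the block structure, the 0.01 initial learning rate, and the three-block depth come from the excerpt:

```python
import torch
import torch.nn as nn

def conv_block(c_in, c_out):
    """One 'Conv-InstNorm-ReLU-AvgPool' block, as described above."""
    return nn.Sequential(
        nn.Conv2d(c_in, c_out, kernel_size=3, padding=1),
        nn.InstanceNorm2d(c_out),
        nn.ReLU(inplace=True),
        nn.AvgPool2d(2),
    )

# Three consecutive blocks; each AvgPool halves the spatial size (32 -> 4).
net = nn.Sequential(
    conv_block(3, 128), conv_block(128, 128), conv_block(128, 128),
    nn.Flatten(), nn.Linear(128 * 4 * 4, 10),
)
opt = torch.optim.SGD(net.parameters(), lr=0.01)  # initial LR from the text
```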


Learning Text Styles: A Study on Transfer, Attribution, and Verification

Hu, Zhiqiang

arXiv.org Artificial Intelligence

This thesis advances the computational understanding and manipulation of text styles through three interconnected pillars: (1) Text Style Transfer (TST), which alters stylistic properties (e.g., sentiment, formality) while preserving content; (2) Authorship Attribution (AA), identifying the author of a text via stylistic fingerprints; and (3) Authorship Verification (AV), determining whether two texts share the same authorship. We address critical challenges in these areas by leveraging parameter-efficient adaptation of large language models (LLMs), contrastive disentanglement of stylistic features, and instruction-based fine-tuning for explainable verification. First, for TST, we conduct a comprehensive survey and reproducibility study of 19 state-of-the-art algorithms, establishing benchmarks across diverse datasets. Building on these insights, we introduce LLM-Adapters, a unified framework for parameter-efficient fine-tuning (PEFT) that enables cost-effective adaptation of LLMs for style-centric tasks. This culminates in Adapter-TST, a novel architecture that models multiple stylistic attributes (e.g., sentiment, tense) using lightweight neural adapters. Adapter-TST achieves superior performance in multi-attribute transfer and compositional editing while reducing computational costs by 80% compared to full fine-tuning. For AA, we propose ContrastDistAA, a contrastive learning framework that disentangles content and style features to address performance degradation under topic shifts. Our method advances both individual-level attribution and regional linguistic analysis, achieving state-of-the-art accuracy by isolating culturally influenced stylistic patterns.
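The lightweight neural adapters underlying this kind of PEFT follow a standard bottleneck pattern: down-project, nonlinearity, up-project, residual connection. A minimal sketch of that general pattern (not the exact Adapter-TST architecture):

```python
import numpy as np

class Adapter:
    """Bottleneck adapter: only w_down/w_up are trained, the backbone
    weights stay frozen. w_up starts at zero so the adapter is an
    identity map at initialization."""
    def __init__(self, d_model, d_bottleneck, seed=0):
        rng = np.random.default_rng(seed)
        self.w_down = rng.normal(0.0, 0.02, (d_model, d_bottleneck))
        self.w_up = np.zeros((d_bottleneck, d_model))

    def __call__(self, h):
        z = np.maximum(h @ self.w_down, 0.0)  # ReLU bottleneck
        return h + z @ self.w_up              # residual connection
```

Because only the small bottleneck matrices are updated, one adapter per stylistic attribute can be trained at a fraction of the cost of full fine-tuning.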


InfoPO: On Mutual Information Maximization for Large Language Model Alignment

Xiao, Teng, Ge, Zhen, Sanghavi, Sujay, Wang, Tian, Katz-Samuels, Julian, Versage, Marc, Cui, Qingjun, Chilimbi, Trishul

arXiv.org Artificial Intelligence

We study the post-training of large language models (LLMs) with human preference data. Recently, direct preference optimization and its variants have shown considerable promise in aligning language models, eliminating the need for reward models and online sampling. Despite these benefits, these methods rely on explicit assumptions about the Bradley-Terry (BT) model, which makes them prone to overfitting and results in suboptimal performance, particularly on reasoning-heavy tasks. To address these challenges, we propose a principled preference fine-tuning algorithm called InfoPO, which effectively and efficiently aligns large language models using preference data. InfoPO eliminates the reliance on the BT model and prevents the likelihood of the chosen response from decreasing. Extensive experiments confirm that InfoPO consistently outperforms established baselines on widely used open benchmarks, particularly in reasoning tasks.
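The abstract does not spell out the InfoPO objective itself, but the Bradley-Terry-based DPO loss it departs from can be sketched for contrast: the probability that the chosen response beats the rejected one is a sigmoid of the scaled policy/reference log-ratio difference.

```python
import math

def dpo_loss(logp_chosen, logp_rejected, ref_chosen, ref_rejected, beta=0.1):
    """Standard BT-based DPO loss (shown for contrast with InfoPO):
    -log sigmoid(beta * difference of policy-vs-reference log-ratios)."""
    margin = beta * ((logp_chosen - ref_chosen)
                     - (logp_rejected - ref_rejected))
    return -math.log(1.0 / (1.0 + math.exp(-margin)))
```

The explicit sigmoid link is the BT assumption the paper argues leads to overfitting, e.g. the loss can be driven down by lowering the chosen response's likelihood as long as the rejected one falls faster.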